activated in memory in response to only part of the available data. This
candidate hypothesis is then assumed to be checked for consistency against
the remaining data. This latter process is called "consistency checking".
Experiment 1 was performed to provide evidence that consistency checking
occurs during hypothesis generation. Subjects who retrieved and checked
hypotheses for consistency required more time to generate a hypothesis than
subjects who just retrieved hypotheses. Experiment 2 indicated that subjects
performed a task analogous to the consistency checking process faster than
subjects who retrieved and checked hypotheses for consistency. Experiment 3
was performed to provide evidence that consistency checking is a
self-terminating process. Subjects' latencies depended upon the position of a
disconfirming datum within a data set, supporting this conjecture. Although
some of the predictions in Experiment 1 were not supported, the results
generally confirm the existence of a high-speed verification process in
hypothesis generation.
Consistency Checking and Hypothesis Generation

The term "hypothesis generation" refers to the generation of answers or
possible explanations to account for a given set of information, which will be
referred to as "data". For example, a physician generates disease hypotheses in
response to a patient's symptoms and the results of diagnostic tests. Gettys and
Fisher (in press) and Gettys, Fisher, and Mehle (1978) have developed a
tentative model of the hypothesis generation process. According to this model,
hypotheses are first retrieved or activated within a semantic memory network
similar to that described by Collins and Loftus (1975). Any hypothesis so
retrieved is assumed to be subject to some form of plausibility assessment,
which may range from simple semantic verification to complex processes
involving probabilistic inference, depending upon task demands and the
importance of the problem. The purpose of the present series of experiments is
to provide evidence for the existence of a minimal form of plausibility
assessment which is assumed to be intimately tied to the retrieval of
hypotheses in multiple data problems.

Gettys, Fisher, and Mehle (1978) have described a study which estimated the
number of activation tags a hypothesis node receives before it is retrieved as
a possible hypothesis. For problems which involved three or six data, the
hypotheses were estimated to be retrieved using only two or three data,
respectively. These estimates suggest that some mechanism or process must exist
to ensure that the retrieved hypothesis is consistent with all of the available
data rather than just the data from which the hypothesis was retrieved. We have
named this proposed process "consistency checking" and have made three
predictions concerning its operation in the overall hypothesis generation
process. First, when consistency checking occurs in hypothesis generation, this
process should add to the time required to generate a consistent hypothesis,
beyond that required to retrieve a hypothesis from memory. Secondly,
consistency checking is expected to be a more rapid process than hypothesis
generation. Thirdly, it is expected that consistency checking is a
self-terminating process.


According to our theoretical analysis, the generation of hypotheses which are
consistent with a set of data involves the operation of independent retrieval
and consistency checking processes. The retrieval process involves activation
of potential hypotheses in response to only part of the total number of data
presented in a decision problem. Any retrieved hypothesis is then assumed to be
checked for consistency against the remaining data. If the retrieved hypothesis
is found to be consistent with the remaining data, it will be emitted as a
response. However, if any datum is found to be inconsistent with the retrieved
hypothesis, the entire process will be repeated until a consistent hypothesis
is found.
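
The retrieve-then-check cycle described above can be sketched in a few lines
of Python. This is only an illustrative sketch under stated assumptions, not
the model itself: the toy knowledge base ANIMALS, the helper names retrieve
and generate_consistent_hypothesis, and the choice of the first n_cues data
as retrieval cues are all hypothetical.

```python
from typing import Dict, Iterator, Optional, Sequence, Set

# Hypothetical toy knowledge base: animal -> characteristics (an assumption).
ANIMALS: Dict[str, Set[str]] = {
    "bat": {"flies", "has fur", "nocturnal"},
    "robin": {"flies", "has feathers"},
    "owl": {"flies", "has feathers", "nocturnal"},
}


def retrieve(cues: Sequence[str]) -> Iterator[str]:
    """Yield candidates activated by the retrieval cues only (partial data)."""
    for animal, traits in ANIMALS.items():
        if all(cue in traits for cue in cues):
            yield animal


def generate_consistent_hypothesis(
    data: Sequence[str], n_cues: int
) -> Optional[str]:
    """Retrieve from the first n_cues data, then check the remaining data."""
    cues, remaining = data[:n_cues], data[n_cues:]
    for hypothesis in retrieve(cues):
        # all() stops at the first inconsistent datum it meets.
        if all(datum in ANIMALS[hypothesis] for datum in remaining):
            return hypothesis  # consistent: emitted as the response
        # Inconsistent: the candidate is discarded and retrieval repeats.
    return None


data = ["flies", "nocturnal", "has feathers"]
first_retrieved = next(retrieve(data[:1]))  # response without any checking
consistent = generate_consistent_hypothesis(data, n_cues=1)
```

With these toy data the first retrieved candidate ("bat") fails the check
against "has feathers", so additional retrievals are needed before a
consistent candidate ("owl") is emitted.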


If the consistency checking process is independent from hypothesis retrieval,
it should be possible to eliminate consistency checking from hypothesis
generation by instructing subjects to respond with the first retrieved
hypothesis suggested by a set of data irrespective of its consistency ("first
hypothesis retrieval condition"). Data from subjects given such instructions
could then be compared to data from subjects instructed to generate consistent
hypotheses ("consistent hypothesis retrieval condition"). We predict that
subjects instructed to generate consistent hypotheses will perform a hypothesis
generation task more slowly than subjects instructed to respond with the first
retrieved hypothesis. This time difference should be due to the occurrence of
the additional retrievals and consistency checking required to generate a
consistent hypothesis. In addition, it is predicted that this time difference
will increase as a function of the number of data presented in the decision
problem (data set size). This is expected to occur for two reasons. First, as
data set size increases, the average number of data which are checked for
consistency will increase. Thus, the amount of time used for consistency
checking will increase with data set size, adding disproportionately to the
time needed to retrieve a hypothesis from memory. Secondly, the generation of
consistent hypotheses may require the retrieval of several hypotheses, some of
which will be discarded as a result of consistency checking. Since we believe
that hypothesis retrieval involves only part of the available data, it is
expected that the probability of generating a consistent hypothesis on the
first retrieval attempt will decrease as a function of data set size. As data
set size increases, more retrievals will be necessary to generate a consistent
hypothesis, and these will add disproportionately to the time needed to
retrieve a hypothesis. For this same reason we expect the number of errors to
increase as a function of set size in the "first hypothesis" retrieval
condition, but not in the "consistent hypothesis" condition. Thus, we expect an
interaction between these instruction conditions and data set size for both RT
and error data.


Regression analyses predicting hypothesis generation reaction time (RT) as a
function of set size can be performed for both the "first hypothesis" and
"consistent hypothesis" instruction conditions. The slopes of the best-fitting
regression lines can then be compared: the slope for the "first hypothesis"
retrieval instruction condition will be less than the slope of the "consistent
hypothesis" retrieval instruction condition. The difference in the slopes of
these lines will reflect how much time is required to perform the additional
retrievals and consistency checking as set size increases and can be used as a
crude estimate of the additional time required for these processes.


The second major consistency checking prediction is that it should be a more
rapid process than hypothesis generation. In a semantic network model of memory
(Collins and Loftus, 1975), concepts are represented as nodes interconnected by
relational pathways. When a hypothesis is retrieved from such a network,
activation is assumed to spread from both the nodes representing the general
hypothesis category and from the data until several of these sources of
activation meet at a hypothesis node. This activated node would then become a
potential hypothesis. Thus, hypothesis retrieval may involve the activation of
potential hypothesis nodes by relational paths leading from the data and
general hypothesis category nodes. However, in consistency checking, the
hypothesis is already available in memory and only the relational pathways
leading to the data nodes must be activated. Thus, consistency checking is
assumed to involve only the activation of relational information. From this it
follows that consistency checking should occur at a faster rate than hypothesis
retrieval.


The time required to generate a consistent hypothesis (hypothesis generation
task) can be compared to the time required to perform a task which is analogous
to the consistency checking process. This task involves the initial
presentation of a hypothesis, followed by the presentation of a data set. The
subject then must decide if the hypothesis is consistent with the data. This
task can be considered analogous to consistency checking since it eliminates
the hypothesis retrieval process and involves only the verification of
relational information between the hypothesis and data. We expect that the
consistency checking task will be performed faster than hypothesis generation,
because the hypothesis generation task requires the additional retrieval
process. We also expect that the time difference between these two tasks will
increase as a function of data set size, since the average number of retrievals
involved in the generation of consistent hypotheses is thought to follow the
same function. Thus, we predict a task by data set size interaction. Regression
analyses predicting hypothesis generation and consistency checking RT as a
function of set size can also be performed. It is expected that the slope of
the best-fitting regression line for the consistency checking task will be
considerably less than the slope of the hypothesis generation task. The
difference between these slopes can also be used as a crude estimate of the
additional time needed for retrieval of a consistent hypothesis, since the
hypothesis generation task involves hypothesis retrieval while the consistency
checking task does not.


The final prediction about consistency checking is that it is a
self-terminating process. If a disconfirming relationship is found between a
potential hypothesis and a single datum, the consistency checking process will
stop. This prediction is based on the efficiency of this type of search. If a
subject encounters a datum which is inconsistent with a hypothesis, the
hypothesis is rendered implausible and it is useless to continue to verify it
with the remaining data. This prediction can be examined using the same
consistency checking task as was used to estimate the rate of hypothesis
retrieval. If consistency checking terminates upon encountering a disconfirming
datum, then the latency to render a hypothesis implausible should increase
as the ordinal position of the first disconfirming datum is increased in a set.
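
The contrast between a self-terminating and an exhaustive check can be made
concrete with a small sketch. It is an illustration only: the function name
data_examined and the representation of a data set as a list of
consistent/inconsistent flags are assumptions introduced here, and the number
of data examined is used as a stand-in for latency.

```python
from typing import List, Sequence


def data_examined(consistent_flags: Sequence[bool],
                  self_terminating: bool = True) -> int:
    """Count how many data are examined before the check ends.

    consistent_flags[i] is True when datum i is consistent with the
    hypothesis being checked (a hypothetical encoding).
    """
    examined = 0
    for is_consistent in consistent_flags:
        examined += 1
        if self_terminating and not is_consistent:
            break  # stop at the first disconfirming datum
    return examined


def one_disconfirming(set_size: int, position: int) -> List[bool]:
    """Build a data set whose only disconfirming datum is at `position` (1-based)."""
    return [i != position for i in range(1, set_size + 1)]


# Four-datum sets with the disconfirming datum in positions 1 through 4.
self_term = [data_examined(one_disconfirming(4, p)) for p in range(1, 5)]
exhaustive = [data_examined(one_disconfirming(4, p), self_terminating=False)
              for p in range(1, 5)]
```

Under the self-terminating assumption the count grows with the position of
the disconfirming datum, whereas an exhaustive check examines every datum
regardless of position.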


Three experiments were performed to verify these three major predictions.
Experiment 1 involved an instructional manipulation where subjects either
generated the first hypothesis suggested by a set of data or a hypothesis which
was consistent with all of the data. Experiment 2 involved a task manipulation
in which subjects either generated hypotheses or checked them for consistency.
Finally, Experiment 3 was designed to test the self-terminating assumption by
manipulating the position of a disconfirming datum within a set.


Scaling Study


An  initial  scaling  study  was  performed  to  select  a roughly  homogeneous  set  of 
hypothesis  generation  problems  to  be  used  in  experiments  1 and  2. 
Method

Materials. A total of 100 animal hypothesis generation problems were used as
materials. All problems consisted of characteristics normally associated with
different animals. The data included such items as mode of locomotion, types of
appendages, native continent, food sources, color, and size. Twenty-five
problems were included in each set size (1, 2, 3, and 4 data). All the data
were selected by the senior author to suggest a fairly large number of animals.
Thus each problem could have several correct answers.

Subjects.  Twenty-four  University  of  Oklahoma  introductory  psychology  students 
served  as  subjects  for  class  credit  and  were  run  in  two  groups  of  12 
subjects. 

Procedure. Subjects were presented the hypothesis generation problems one at
a time by an overhead projector. The order of the problems was not randomized.
The one datum problems were presented first, followed successively by the two,
three, and four data problems. The problems were presented on a screen and
subjects were given 60 seconds to write as many animals as they could which
were consistent with all the data presented on that trial. After 60 seconds the
experimenter stopped the group and moved to the next problem. Subjects were
given a five minute break halfway through the procedure.

Analysis.  The  total  number  of  generated  hypotheses,  the  total  number  of 
unique  hypotheses,  and  the  percentage  of  correct  and  incorrect  hypotheses  were 
tabulated  for  each  of  the  100  problems  used  in  the  study.  These  results  were 


levels (first hypothesis vs. consistent hypothesis). Data set size was
manipulated as a within-subjects variable with four levels (1, 2, 3, and 4
data) and problems were nested within each level of set size. Equal numbers of
male and female subjects were included within each retrieval instruction
condition. Thus, sex was treated as an additional blocking variable in the
design. Performance was measured by reaction time (RT) and error rate.
Reaction time was defined as the time required for a subject to generate a
hypothesis following data onset. Errors were measured by the correctness of the
generated hypotheses in light of the data for a particular problem.


Hypothesis generation problems. Forty-eight problems were used as stimuli for
the generation of hypotheses. All problems consisted of characteristics
normally associated with different animals. Twelve problems of each set size
(1, 2, 3, and 4 data) were selected from those presented in the preliminary
scaling study on the basis of the percentage of correct responses given for a
particular problem. To minimize the number of incorrect hypotheses made in the
present study, the problems with the twelve highest percentages of correct
responses were chosen for each set size. Overall, the percentage of correct
[...] In addition, there were four practice problems of each set size which
consisted of products and industries for which different states are noted. The
products and industries were selected to suggest the names of states which had
a large variety of keyboard characters in their names, to familiarize subjects
with the location of as many letters on the keyboard as possible before the
presentation of the experimental problems.




Instructions. Subjects generated hypotheses under instructions to either
respond with the first hypothesis which was suggested by a set of data or to
respond with a hypothesis which was consistent with all of the data for a
particular problem. In the "first hypothesis" retrieval condition, subjects
were told to read all of the data presented on a given problem and then respond
with the first hypothesis which occurred to them without regard for its
correctness or plausibility. In contrast, subjects assigned to the "consistent
hypothesis" retrieval condition were told to read the data presented on a given
problem and then respond with a hypothesis which was consistent with all of the
data. In the practice problems, a consistent hypothesis was defined as a state
which was known for all of the product and industry data presented on a given
trial. In the experimental problems, a consistent hypothesis was an animal name
which had all of the animal characteristic data presented on a given problem.
In addition, subjects in the consistent hypothesis condition were told to
generate specific rather than general animal names. This was done because
higher order animal classes (e.g., bird) were usually not consistent with all
of the data in most of the problems.


In both instruction conditions, subjects were told that they were being timed
and were given accuracy instructions in regard to the speed-accuracy trade-off
(Pachella, 1974). In the first hypothesis condition, accuracy was defined as
responding with the first hypothesis which was suggested by the data, while in
the consistent hypothesis condition, accuracy was defined as responding with a
hypothesis which was consistent with all of the data.




Procedure. Upon entering the laboratory, subjects were seated at a Compucolor
Model 8001 microcomputer which presented the entire experiment except for [...]
and collected all responses. First, the appropriate retrieval instructions were
given in the context of the state practice problems. Once the instructions were
understood by the subject, the 16 practice problems were presented in the same
order for all subjects. When these were completed, the instructions were
repeated in the context of the experimental animal problems and these 48
problems were presented in a random order.


Both the practice and experimental problems were presented in a similar manner.
[...] was printed in the center of the computer screen so that it could be read
from left to right. At this time a software clock started in the computer.
Subjects were instructed to type their responses as soon as they thought of an
appropriate hypothesis, and the first keystroke of their response stopped the
clock and measured the latency of hypothesis generation. All subjects were
forced to give an answer to all problems. Once the entire response had been
typed and any spelling errors corrected, the subject pressed the "shift" key to
advance the program to the next problem. This had the effect of erasing the
screen and producing a 1.5 second delay before the next series of data was
presented. At no time did subjects receive feedback concerning the correctness
of their responses.


Subjects. Forty-eight University of Oklahoma introductory psychology students
served as subjects for class credit. Limited typing skills were required of all
subjects who participated. Subjects were randomly assigned to one of the
retrieval conditions upon entering the laboratory. The data of an additional 1?
subjects were discarded because of equipment failure, the inability to type, or
because [...]


Results and Discussion


An ANOVA was performed on the trimmed means for each set size of a subject's
latency data. Any individual RT was excluded from these means if it was above
[...] standard deviations from the mean of all RTs within that set size. This
cutoff criterion was chosen because it usually eliminated only extreme outlying
latencies. This was necessary because subjects occasionally "drew a blank" and
had latencies longer than two minutes. The analysis resulted in the expected
main effects of instructions, F(1,44) = 8.43, MSe = 10.38, p < .01 [...].
However, the instructions by set size interaction was not significant. The
means of both instruction conditions for all set sizes are presented in
Table 1.
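
The trimming rule can be sketched as follows. This is a minimal sketch under
stated assumptions, not the analysis actually run: the number of standard
deviations is not legible in the text, so it appears here as a free parameter
k; the function name trimmed_mean, the use of the sample standard deviation,
and the example latencies are all hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def trimmed_mean(subject_rts: Sequence[float],
                 all_rts: Sequence[float],
                 k: float) -> float:
    """Mean of one subject's RTs after dropping extreme latencies.

    An RT is dropped when it lies more than k standard deviations above
    the mean of all RTs in the same set size. k is an assumed parameter.
    """
    cutoff = mean(all_rts) + k * stdev(all_rts)
    kept = [rt for rt in subject_rts if rt <= cutoff]
    return mean(kept)


# Hypothetical latencies (seconds) for one set size; 130.0 is a "blank".
all_rts = [3.0, 4.0, 5.0, 4.0, 3.5, 4.5, 4.0, 130.0]
subject_rts = [4.0, 130.0, 5.0]

strict = trimmed_mean(subject_rts, all_rts, k=2.0)    # 130.0 is excluded
lenient = trimmed_mean(subject_rts, all_rts, k=10.0)  # 130.0 is retained
```

Because the cutoff is computed from all RTs within a set size, a single very
long latency inflates both the mean and the standard deviation, which is one
reason such a rule tends to remove only the most extreme values.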


Insert  Table  1 about  here 


Another ANOVA was performed on the means of the correctly answered problems of
each set size. These means were used so the results of the present experiment
could be compared to equivalent data obtained in Experiment 2. Again, there
were significant effects of instructions, [...] p < .05, and set size,
F(3,132) = 47.94, MSe = 5.99, p < .001, and the instruction by set size
interaction was not significant. Regression analyses were performed


TABLE 1

TRIMMED MEAN REACTION TIME IN SECONDS AS A FUNCTION OF
INSTRUCTIONS AND SET SIZE IN EXPERIMENT 1

                                    SET SIZE
INSTRUCTIONS                1       2       3       4

FIRST HYPOTHESIS          3.13    3.92    6.76    6.35
CONSISTENT HYPOTHESIS     4.08    5.43    8.97    8.26

on these means, predicting RT as a function of set size. The resulting [...]
were [...] for the first and consistent hypothesis conditions, respectively.
The slopes obtained from these analyses were then used to estimate the extra
time required by consistency checking. The slope obtained for the "first
hypothesis" condition was 1.49 second/datum while the slope of the "consistent
hypothesis" condition was 1.85 second/datum. The difference, .36 second/datum,
is an estimate of the extra [...] The mean RTs of the correctly answered
problems for both instruction conditions and all set sizes are presented in
Table 2.
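
The slope comparison can be illustrated with an ordinary least-squares fit of
the four cell means of correctly answered problems on set size. The sketch
below is illustrative rather than a record of the original analysis: the
function name ols_slope is hypothetical, and the two smallest "consistent
hypothesis" means (4.74 and 5.57) are read from a damaged copy of the table
and should be treated as assumptions.

```python
from typing import Sequence


def ols_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


set_sizes = [1, 2, 3, 4]
first_means = [3.42, 4.41, 7.75, 7.28]        # seconds
consistent_means = [4.74, 5.57, 10.42, 9.30]  # first two values assumed

slope_first = ols_slope(set_sizes, first_means)
slope_consistent = ols_slope(set_sizes, consistent_means)
extra_time_per_datum = slope_consistent - slope_first

print(f"first: {slope_first:.2f} s/datum")
print(f"consistent: {slope_consistent:.2f} s/datum")
print(f"difference: {extra_time_per_datum:.2f} s/datum")
```

Under these inputs the fitted slopes come out at roughly 1.49 and 1.85
second/datum, a difference of about .36 second/datum.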


Insert  Table  2 about  here 


These means are also presented graphically, along with the best-fitting
regression lines, in Figure 1.


Insert  Figure  1 about  here 


The significant set size effect is consistent with the results of Graesser and
Mandler (1978), who found that the time required to generate the name of a
dimension which is common to a set of words increases as a function of set


TABLE 2

MEAN REACTION TIME IN SECONDS OF CORRECT HYPOTHESES
AS A FUNCTION OF INSTRUCTIONS AND SET SIZE IN EXPERIMENT 1

                                    SET SIZE
INSTRUCTIONS                1       2       3       4

FIRST HYPOTHESIS          3.42    4.41    7.75    7.28
CONSISTENT HYPOTHESIS     4.74    5.57   10.42    9.30




Figure 1. Mean reaction time of correctly answered problems. A) Experiment 1
for the "first" and "consistent" instruction conditions, and B) Experiment 2
for the hypothesis generation and consistency checking tasks. Also shown are
the best-fitting regression lines for both instruction and task conditions.
(Legend, panel A: consistent hypothesis, first hypothesis. Panel B: hypothesis
generation, consistency checking.)
size. However, the present results show that RT does not increase monotonically
with set size, since the mean of set size four is actually lower than that of
set size three for both instruction conditions. This was probably due to the
specific three-data problems used in the experiment. Our conjecture about this
finding is that the three-data problems were generally more difficult for our
subjects than the four-data problems. This was an unfortunate result of not
being able to use randomly selected data for each subject. Fixed data sets were
used to avoid the possibility that the same data would be presented to the same
subject twice and also because some random selections of animal characteristics
would have no correct answers (e.g., has wings, has four legs), or have correct
answers which a typical subject would not know.

The significant instruction effect supports the prediction that consistency
checking occurs during the hypothesis retrieval process. However, the failure
to find a significant instructions by set size interaction is not consistent
with the prediction that disproportionately more retrievals, and thus more
consistency checking, will occur as set size increases. Nevertheless, another
result was obtained which is consistent with our prediction: the analysis
performed on the number of errors made within each set size condition per
subject resulted in a significant set size effect, F(3,132) = 8.15, MSe = 1.06,
p < .001. The mean numbers of errors for set sizes 1 through 4 were 1.25, 1.75,
1.79, and 2.29, respectively. However, the instructions by set size interaction
was not significant.


This result is consistent with the idea that hypotheses are retrieved using
only part of the data. If hypothesis retrieval is based upon all of the data,
regardless of set size, then a constant error rate would be expected across
all set sizes. However, if retrieval is based on only part of the data, then it
would be expected that errors should increase as a function of set size.
Gettys, Fisher, and Mehle (1978) estimated that the number of data from which a
hypothesis is retrieved increases disproportionately slower than data set size.
This means that the probability of a retrieved hypothesis being consistent with
all of the data will become smaller as data set size increases. Since the
probability of an error is a positive function of the retrieved hypothesis
being inconsistent with part of the data, the number of errors should increase
with set size.
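
This argument can be illustrated with a deliberately simple calculation. The
sketch is not part of the model described here: it assumes that each datum
not used in retrieval is consistent with the retrieved hypothesis
independently, with the same probability q, and the function name
p_consistent and the value q = 0.8 are hypothetical. The pairs (3 data, 2
used) and (6 data, 3 used) follow the estimates attributed above to Gettys,
Fisher, and Mehle (1978).

```python
def p_consistent(set_size: int, data_used: int, q: float) -> float:
    """Probability that a retrieved hypothesis fits all of the data.

    Assumes the hypothesis fits the data it was retrieved from, and fits
    each remaining datum independently with probability q (an assumption).
    """
    if not 0 <= data_used <= set_size:
        raise ValueError("data_used must lie between 0 and set_size")
    unchecked = set_size - data_used
    return q ** unchecked


q = 0.8  # hypothetical per-datum probability of consistency
p_three = p_consistent(set_size=3, data_used=2, q=q)  # one datum unchecked
p_six = p_consistent(set_size=6, data_used=3, q=q)    # three data unchecked
p_error_three = 1 - p_three
p_error_six = 1 - p_six
```

With these assumed values the chance that an unchecked first retrieval is
consistent with every datum drops from 0.8 for three data to about 0.51 for
six data, so the expected error rate rises with set size.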

In addition, more errors were made by the first hypothesis condition (1.99)
than by the consistent hypothesis condition (1.55), but this difference did not
attain traditional levels of significance, F(1,44) = 2.96, MSe = 3.09, p < .10.
This trend in the means also is consistent with our prediction; subjects who do
not consistency check should generate more hypotheses which are not consistent
with all of the data. The error data are consistent with the prediction that
"first hypothesis" condition subjects would have more errors as set size
increased, but are not consistent with the prediction that "consistent
hypothesis" subjects would have a constant number of errors across set size.
Evidently, subjects in the "consistent hypothesis" condition did not always
check their answers for consistency, or did not have enough knowledge to
generate consistent hypotheses.

The overall low error rate between the "first" and "consistent" instruction
conditions also indicates that subjects in the "first hypothesis" condition
probably engaged in some plausibility assessment before emitting their [...]
not completely successful in either producing or eliminating the consistency
checking process. The failure to find a significant interaction between
instructions and set size with the RT [...] manipulation.


Examination of possible artifacts. Another possible explanation of the
instruction effect is that subjects in the "first hypothesis" condition tended
to repeat answers more often than in the "consistent hypothesis" condition.
Collins and Loftus (1975) predict that a previously activated concept will be
relatively easier to reactivate than a non-activated concept. This would lead
to the prediction that a previously generated hypothesis would require less
time to generate than a hypothesis which has been retrieved for the first time.
Loftus and Loftus (1974) and Loftus (1973) have found that repeated retrievals
from a category result in faster latencies than the first retrieval from the
same category. To test this prediction, the number of repeated hypotheses was
found for each subject and entered into an ANOVA. The results showed that
females (13.37) made significantly more repetitions than males (10.54),
F(1,44) = 5.01, MSe = 19.2, p < .05, but the instruction effect was not
significant, suggesting that response repetition does not account for the
observed instructions effect.


Another possible explanation for the instruction effect is that the
instructions given to the "consistent hypothesis" subjects asked for specific
animal names rather than general categories of animals, while no such
restriction was imposed on the "first hypothesis" subjects. The retrieval of
specific animal names may involve a more extensive memory search than the
retrieval of general categories of animal (e.g., bird, fish, snake). If
this is true, then it would be expected that the "consistent hypothesis"
condition would be slower than the "first hypothesis" condition. First, the
number of general animal names given by each subject in both instruction
conditions was counted. An ANOVA performed on these data resulted in a
significant effect of instructions, F(1,44) = 10.64, MSe = 9.05, p < .01. The
mean numbers of general category names for the "first" and "consistent"
hypothesis conditions were 3.25 and 5.42, respectively. In addition, males
(5.58) produced significantly fewer general names than females (3.08), F(1,44)
= 8.28, MSe = 9.05, p < .01. If the instruction effect on RT was due to this
difference in the number of general names, then it would be expected that there
would be a high correlation between the number of general names emitted and the
mean RT for each subject. However, the correlation between these measures was
-.210, which is not convincing evidence for this argument.

Therefore, the instruction effect can tentatively be explained by the operation
of a consistency checking process, although the failure to find a significant
instructions by set size interaction was disquieting. As previously mentioned,
this was probably due to the weak effect of instructions. In addition, the data
set sizes in the present experiment may not have been large enough for the
interaction to be manifest. In spite of this failure, the instruction
manipulation did produce results predicted by the operation of consistency
checking in hypothesis generation.


This design was a "group-yoked" design. It was termed "group-yoked" because
the hypotheses generated by the subjects in the hypothesis generation task were
presented to subjects in the consistency checking task. Thus the subjects in
the consistency checking task were yoked to the correct responses given by
subjects in the hypothesis generation task. This yoking could not be done at
the individual level because a typical hypothesis generation subject does not
always give correct answers to the problems. To eliminate this problem, all of
the correct answers for a particular hypothesis generation problem were pooled
and presented to consistency checking subjects in the same proportions as they
were originally generated. Equal numbers of male and female subjects were
included in both task conditions, and sex served as an additional blocking
variable in the design. Performance was measured by the latency to either
generate a hypothesis or check a hypothesis against a set of data in the
hypothesis generation and consistency checking tasks, respectively.
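The pooling step of the group-yoked design can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the animal names and counts are invented, the function name `yoke_hypotheses` is hypothetical, and largest-remainder rounding is one arbitrary way to keep the proportions approximately the same.

```python
from collections import Counter

def yoke_hypotheses(pooled_correct, n_checkers):
    """Decide how many checking subjects see each pooled correct hypothesis.

    pooled_correct: list of correct hypotheses given to one problem by the
                    hypothesis generation group (repeats allowed).
    n_checkers:     number of consistency checking subjects to serve.
    Returns a dict mapping hypothesis -> number of checking subjects.
    """
    counts = Counter(pooled_correct)
    total = sum(counts.values())
    # Ideal (possibly fractional) share of checking subjects per hypothesis.
    shares = {h: c * n_checkers / total for h, c in counts.items()}
    alloc = {h: int(s) for h, s in shares.items()}
    leftover = n_checkers - sum(alloc.values())
    # Assumption: hand remaining slots to the largest fractional remainders.
    by_remainder = sorted(shares, key=lambda h: shares[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc

# Hypothetical pool for a single problem (NOT the experiment's responses).
pool = ["robin"] * 12 + ["sparrow"] * 6 + ["eagle"] * 2
allocation = yoke_hypotheses(pool, n_checkers=10)
print(allocation)
```

Yoking at the group level in this way sidesteps the problem that any single generation subject leaves some problems without a correct answer.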


Forty-eight University of Oklahoma introductory psychology students served as
subjects for class credit. The data of an additional 20 subjects were discarded
because of equipment failure, poor typing skills, or because the subject was
not a native speaker of English.


Materials. The same 48 hypothesis generation problems used in experiment 1
were also used in the present experiment. In the consistency checking task
there were 48 "true" and 48 "false" problems. The same animal characteristic
data used in experiment 1 were also used in the "true" consistency checking
problems. The hypotheses presented in the "true" consistency checking problems
were the correct hypotheses generated by subjects who completed the hypothesis
generation task using the same data. Different correct hypotheses generated by
the hypothesis generation subjects were presented to consistency checking
subjects in approximately the same proportions as they had been emitted in the
hypothesis generation task.

An additional 48 "false" problems were constructed to make 'yes' and 'no'
responses equally probable in the consistency checking task. Both the data and
hypotheses used in these "false" problems were selected by the experimenter.
Twelve "false" problems were used for each data set size and one datum was
chosen to be inconsistent with the hypothesis used in each problem. In the case
of multiple data problems the position of the disconfirming datum was
counterbalanced across different problems.

Procedure. Since the hypotheses which were checked for consistency were
generated by subjects in the hypothesis generation task, it was necessary to
run the hypothesis generation task before the consistency checking task. The
procedure used in the hypothesis generation task was identical to that used in
the consistent hypothesis retrieval condition in experiment 1.

Subjects in the consistency checking task were seated at a Compucolor model
8001 microcomputer which presented the entire experiment except for
instructions and also recorded all responses. Then 14 practice problems were
presented which involved the same data as used in experiment 1, followed by the
96 experimental animal problems. In both the practice and experimental problems
subjects were told to make a 'yes' response if the presented hypothesis was
consistent with all of the data, or a 'no' response if any one datum was
inconsistent with the hypothesis.

A consistency checking trial began with the presentation of a hypothesis which
was a state name in the practice problems and an animal name in the
experimental problems. This hypothesis remained on the screen until the
subject pressed the 'space bar' on the computer keyboard. At that time the
hypothesis was erased, followed by a 1.5 second delay and the presentation of
the data. The subject then pressed either the "Z" or "/" key to indicate either
a "yes" or "no" response, and the position of the "yes" and "no" keys was
counterbalanced across subjects. These keys were located on the bottom row of
the computer keyboard and were chosen as response keys because they were widely
separated and subjects could easily keep their fingers poised above these two
buttons. When a response was made, the software clock stopped, and the
response was printed on the screen beneath the data and remained there for a 1
second interval. Finally, the screen was cleared and the next hypothesis was
presented. This procedure was repeated until all of the problems had been
presented.


Results  and  Discussion 


An ANOVA was performed using the mean latencies of the correctly answered
hypothesis generation and "true" consistency checking problems. The effects of
task ... .0001, were significant. The task by set size interaction was
significant, F(3,132) ... .0001. Regression analyses performed on the two task
conditions predicting RT as a function of set size produced correlations of
.933 and .938 for the hypothesis generation and consistency checking tasks,
respectively. The slopes ... the hypothesis generation and consistency checking
tasks, respectively. The difference between these two slopes was 1.794. The
mean RT's for both task conditions across set size are presented in Table 3.


Insert  Table  3 about  here 


The same mean RT's along with their respective best-fitting regression lines
are presented in Figure 1b.


By comparing Figures 1a and 1b, it can be seen that the y-intercept is
noticeably greater for the hypothesis generation task in experiment 2 than for
the identical "consistent hypothesis" condition in experiment 1. This
difference may have been due to a difference in typing skills between the
subjects in the two experiments. Subjects were allowed to sign up for
experiment 2 without having typing skills and only those subjects who had great
difficulty in finding key locations were discarded. However, in experiment 1,
subjects were not allowed to sign up unless they were familiar ...


TABLE 3

MEAN REACTION TIME IN SECONDS OF CORRECT HYPOTHESES
AS A FUNCTION OF TASK AND SET SIZE IN EXPERIMENT 2

CONSISTENCY CHECKING

HYPOTHESIS GENERATION


The significant task effect demonstrates that a consistency checking task is
performed more rapidly than a hypothesis generation task. The large difference
in the y-intercepts between the tasks is due to the different responses made by
the two groups. The hypothesis generation task involved many different keys on
the keyboard, while the consistency checking task involved only two. Despite
this difference in y-intercepts, it can be seen that the slope of the best
fitting regression line of the consistency checking task is considerably lower
than that of the hypothesis generation task. This indicates that the amount of
time needed to process one additional datum was considerably greater for the
hypothesis generation task than the consistency checking task. This result
provides support for the prediction that consistency checking is a more rapid
process than hypothesis generation. In addition, the crude estimate of the
additional time required for consistency checking obtained in experiment 1
(.361 second/datum) is also considerably less than the estimate of the
additional time required for hypothesis retrieval obtained in the present
experiment (1.798 second/datum). This difference in estimates is also
consistent with the prediction that consistency checking is a high-speed
verification process rather than a memory search process.
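The size of the difference between the two per-datum estimates can be made concrete with a little arithmetic. The sketch below assumes a simple linear model in which each additional datum adds a fixed amount of time; the two rates are the ones quoted above, while the choice of three additional data and the function name `added_time` are purely illustrative.

```python
# Per-datum time estimates quoted in the text (seconds per datum).
CHECKING_RATE = 0.361    # consistency checking, estimated in experiment 1
RETRIEVAL_RATE = 1.798   # hypothesis retrieval, estimated in experiment 2

def added_time(rate, extra_data):
    """Extra latency predicted by a linear model for `extra_data` more data."""
    return rate * extra_data

# How many times slower retrieval is than checking, per datum.
ratio = RETRIEVAL_RATE / CHECKING_RATE

# Hypothetical illustration: cost of three additional data under each estimate.
extra_checking = added_time(CHECKING_RATE, 3)
extra_retrieval = added_time(RETRIEVAL_RATE, 3)

print(f"retrieval/checking ratio: {ratio:.2f}")
print(f"3 extra data: checking +{extra_checking:.3f} s, "
      f"retrieval +{extra_retrieval:.3f} s")
```

On these estimates, retrieval costs roughly five times as much per datum as checking does.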


Experiment  3 

Experiment 3 was performed to test the prediction that consistency checking is
a self-terminating process. This was done by varying the position of a
disconfirming datum within three-datum consistency checking problems similar to
those used in experiment 2.

Method 

Design. Experiment 3 was a within-subjects design where the independent
variable was the ordinal position of a disconfirming datum within a series of
three data (... datum). Performance was measured by the latency to determine
whether a hypothesis was consistent or inconsistent with all of the available
data. Sex and "yes-no" key positions were also included in the design as
blocking and counterbalancing variables, respectively. In addition the position
of the disconfirming datum was counterbalanced.


Subjects. Twenty-four University of Oklahoma introductory psychology students
served as subjects for class credit. All were randomly assigned to the
counterbalancing conditions.


Eighteen practice and sixty experimental consistency checking problems served
as materials. All consisted of a hypothesis and three data. The practice
problems involved checking countries against products and industries,
occupations against tools, and animals against characteristics. The
experimental problems only involved checking occupations against tools and
animals against characteristics. Within each problem type there were 15
problems in which the hypothesis was consistent with all the data and 15
problems where one datum was inconsistent with the hypothesis. Within these
disconfirming problems, there were five problems with the disconfirming datum
in each of the first, second, and third positions.


Procedure. Upon entering the laboratory, subjects were seated at a Compucolor
model 8001 microcomputer which presented the entire experiment and recorded all
responses. First, instructions were given about the nature of the consistency
checking problems and then the practice problems were presented. Then the
instructions were repeated and followed by the experimental problems. Both the
practice and experimental problems were presented in the ... subjects. The
procedure and instructions used in experiment 3 were similar to those used in
the consistency checking task in experiment 2. First, a hypothesis was
presented on the screen and remained there until the subject pressed the "space
bar". This erased the hypothesis and, following a 1 second delay, the three
data were presented in a vertical list so that they could be read from top to
bottom. The data remained on the screen until the subject pressed either the
"Z" or "/" key on the keyboard, depending upon whether their response was "yes"
or "no" and which of these two keys represented these responses. The response
was then printed on the screen and remained there for 1 second, after which the
screen was cleared and the next hypothesis was presented.


Results and Discussion


A within-subjects ANOVA was performed on the trimmed means obtained for each
type of disconfirming datum problem and the confirming problems for each ...
above or below .75 standard deviations from the mean of the distribution of all
the RT's for a particular problem. The results of this analysis indicated a
significant effect of the position of the disconfirming datum, F(3,60) = 10.91,
MSe = 1943.13, p < .001. The means of the first, second and third disconfirming
datum positions were 2.78, 3.23, and 3.49 seconds, respectively, while the mean
of the confirming or ... was 3.14 seconds. Tukey pairwise comparisons indicated
that position 2 and 3 problems were disconfirmed significantly slower than
position 1 problems, t(1,60) = 3.53 and 5.57, respectively, error term = 12.71,
p < .05, but position 2 problems were not disconfirmed significantly faster
than position 3 problems. Thus, as predicted, RT increased as a function of the
position of the disconfirming datum. This result is consistent with the
prediction that consistency checking is self-terminating. If consistency
checking were an exhaustive process then subjects should continue checking a
hypothesis after encountering a disconfirming datum. However, the present
results suggest that subjects stop consistency checking when a disconfirming
relationship is found between a datum and a hypothesis. The nonsignificant
difference between the position 2 and 3 problems, however, suggests that some
subjects did tend to read the last datum, but evidently most subjects stopped
reading if the disconfirming datum was in the first position. Also of interest
was a regression analysis performed predicting RT as a function of the position
of the disconfirming datum. The slope of the best fitting line was .35
second/datum, which is remarkably close to the .36 second/datum estimate of the
additional time required for consistency checking obtained in experiment 1.
This result provides converging evidence that consistency checking is a more
rapid process than hypothesis retrieval.
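The contrast between the two accounts, and the reported slope, can be illustrated with a short Python sketch. It uses the three mean RTs reported above; the function names are assumptions of this sketch, and the original regression may have been fitted to individual latencies rather than to these three means, so agreement is expected only to rounding.

```python
def data_examined(disconfirming_position, n_data, self_terminating):
    """Number of data read before responding 'no' under each account."""
    # Self-terminating: stop at the disconfirming datum.
    # Exhaustive: always read every datum.
    return disconfirming_position if self_terminating else n_data

def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

positions = [1, 2, 3]
mean_rt = [2.78, 3.23, 3.49]   # seconds, as reported for positions 1 to 3

self_term = [data_examined(p, 3, True) for p in positions]
exhaustive = [data_examined(p, 3, False) for p in positions]
slope = ls_slope(positions, mean_rt)

print("self-terminating:", self_term)    # grows with position
print("exhaustive:      ", exhaustive)   # flat across positions
print(f"slope = {slope:.3f} second/datum")
```

An exhaustive process predicts a flat RT function across positions, whereas the observed means rise by about a third of a second per position, as the self-terminating account predicts.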


Summary 

In summary, the results of experiment 1 demonstrated that subjects who
retrieved and checked hypotheses for consistency required more time to generate
hypotheses than subjects who just retrieved hypotheses. This finding provides
evidence that consistency checking occurs in the hypothesis generation process.

However, the predicted interaction between instructions and set size was not
found. We believe this failure was the result of an ineffective instructional
manipulation. Evidently subjects in the "first hypothesis" condition
inadvertently checked their responses for consistency, since they did not make
significantly more errors than subjects in the "consistent hypothesis"
condition. In addition, subjects in the "consistent hypothesis" condition did
not always produce consistent hypotheses, especially in the larger set sizes.
However, a similar interaction was found in experiment 2, which involved a task
manipulation. This interaction demonstrated that consistency checking is
performed much more rapidly than hypothesis generation. Finally, experiment 3
provided evidence that consistency checking is a self-terminating process.